Testing k-Wise Independence over Streaming Data
Authors
Abstract
Following the work of Indyk and McGregor [5], we consider the problem of identifying correlations in data streams. They consider a model in which a stream of pairs (i, j) ∈ [n]^2 arrives, giving a joint distribution (X, Y). They give algorithms that approximate how close the joint distribution is to the product of the marginal distributions under various metrics, which naturally corresponds to how close X and Y are to being independent. We extend their main result to higher dimensions, where a stream of m k-dimensional vectors in [n]^k arrives, and we wish to approximate, in a single pass, the ℓ2 distance between the joint distribution and the product of the marginal distributions. Our analysis gives a randomized algorithm that computes a (1 ± ε)-approximation (with probability 1 − δ) and requires space logarithmic in n and m and proportional to 3^k.
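To make the target quantity concrete, the following is a minimal brute-force sketch of the ℓ2 distance between the empirical joint distribution and the product of the empirical marginals. It is not the streaming algorithm from the abstract: it stores the whole stream and enumerates all n^k cells, which is the cost a small-space single-pass sketch is meant to avoid. The function name `l2_distance_from_independence` and the 0-based encoding of [n] as {0, ..., n-1} are assumptions made only for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def l2_distance_from_independence(stream, n):
    """Exact l2 distance between the empirical joint distribution of a
    stream of k-tuples over {0, ..., n-1} and the product of its marginals.

    Brute-force baseline (hypothetical helper): uses O(m + n^k) time and
    stores the full stream, unlike a small-space streaming sketch.
    """
    stream = list(stream)
    m = len(stream)          # number of vectors in the stream
    k = len(stream[0])       # dimension of each vector

    # Empirical joint counts and per-coordinate marginal counts.
    joint = Counter(stream)
    marginals = [Counter(v[c] for v in stream) for c in range(k)]

    total = 0.0
    for cell in product(range(n), repeat=k):
        p_joint = joint[cell] / m
        p_prod = 1.0
        for c, value in enumerate(cell):
            p_prod *= marginals[c][value] / m
        total += (p_joint - p_prod) ** 2
    return sqrt(total)


# Coordinates that look independent: every pair appears equally often.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Perfectly correlated coordinates.
correlated = [(0, 0), (1, 1)]

d_independent = l2_distance_from_independence(independent, n=2)
d_correlated = l2_distance_from_independence(correlated, n=2)
```

A distance of zero means the empirical joint distribution factorizes exactly into its marginals; larger values indicate stronger dependence among the k coordinates.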
Similar resources
Measuring k-Wise Independence of Streaming Data under L2 Norm
Measuring independence and k-wise independence is a fundamental problem that has multiple applications, and it has been the subject of intensive research during the last decade (see, among others, the recent work of Batu, Fortnow, Fischer, Kumar, Rubinfeld and White [11] and of Alon, Andoni, Kaufman, Matulef, Rubinfeld and Xie [2]). In the streaming environment, this problem was first addressed...
AMS Without 4-Wise Independence on Product Domains
In their fundamental work, Alon, Matias and Szegedy [3] presented celebrated sketching techniques and showed that 4-wise independence is sufficient to obtain good approximations. The question of what random functions are necessary is fundamental for streaming algorithms (see, e.g., Cormode and Muthukrishnan [9].) We present a somewhat surprising fact: on product domain [n], the 4-wise independe...
Testing Non-uniform k-Wise Independent Distributions over Product Spaces
A discrete distribution D over Σ1 × · · · × Σn is called (non-uniform) k-wise independent if for any set of k indices {i1, . . . , ik} and for any z1 ∈ Σi1 , . . . , zk ∈ Σik , PrX∼D[Xi1 · · ·Xik = z1 · · · zk] = PrX∼D[Xi1 = z1] · · ·PrX∼D[Xik = zk]. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case ...
Testing non-uniform k-wise independent distributions
A distribution D over Σ1 × · · · × Σn is called (non-uniform) k-wise independent if for any set of k indices {i1, . . . , ik} and for any z1 · · · zk ∈ Σi1 × · · · × Σik , PrX∼D[Xi1 · · ·Xik = z1 · · · zk] = PrX∼D[Xi1 = z1] · · ·PrX∼D[Xik = zk]. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case we show an upper bound on the ...
Robust characterizations of k-wise independence over product spaces and related testing results
A discrete distribution D over Σ1 × · · · × Σn is called (non-uniform) k-wise independent if for any subset of k indices {i1, . . . , ik} and for any z1 ∈ Σi1 , . . . , zk ∈ Σik , PrX∼D[Xi1 · · ·Xik = z1 · · · zk] = PrX∼D[Xi1 = z1] · · ·PrX∼D[Xik = zk]. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case we show an upper bound...
Testing k-wise independent distributions
A probability distribution over {0, 1}^n is k-wise independent if its restriction to any k coordinates is uniform. More generally, a discrete distribution D over Σ1 × · · · × Σn is called (non-uniform) k-wise independent if for any subset of k indices {i1, . . . , ik} and for any z1 ∈ Σi1 , . . . , zk ∈ Σik , PrX∼D[Xi1 · · ·Xik = z1 · · · zk] = PrX∼D[Xi1 = z1] · · ·PrX∼D[Xik = zk]. k-wise indepen...
Journal title:
Volume / Issue
Pages -
Publication date: 2009